Grounding language in events
نویسنده
چکیده
Broadcast video and virtual environments are just two of the growing number of domains in which language is embedded in multiple modalities of rich non-linguistic information. Applications for such multimodal domains are often based on traditional natural language processing techniques that ignore the connection between words and the non-linguistic context in which they are used. This thesis describes a methodology for representing these connections in models which ground the meaning of words in representations of events. Incorporating these grounded language models with text-based techniques significantly improves the performance of three multimodal applications: natural language understanding in videogames, sports video search and automatic speech recognition. Two approaches to representing the structure of events are presented and used to model the meaning of words. In the domain of virtual game worlds, a hand-designed hierarchical behavior grammar is used to explicitly represent all the various actions that an agent can take in a virtual world. This grammar is used to interpret events by parsing sequences of observed actions in order to generate hierarchical event structures. In the noisier and more open-ended domain of broadcast sports video, hierarchical temporal patterns are automatically mined from large corpora of unlabeled video data. The structure of events in video is represented by vectors of these hierarchical patterns. Grounded language models are encoded using Hierarchical Bayesian models to represent the probability of words given elements of these event structures. These grounded language models are used to incorporate non-linguistic information into text-based approaches to multimodal applications. In the virtual game domain, this non-linguistic information improves natural language understanding for a virtual agent by nearly 10% and cuts in half the negative effects of noise caused by automatic speech recognition. For broadcast video of baseball and American football, video search systems that incorporate grounded language models are shown to perform up to 33% better than text-based systems. Further, systems for recognizing speech in baseball video that use grounded language models show 25% greater word accuracy than traditional systems.
منابع مشابه
Multi-Resolution Language Grounding with Weak Supervision
Language is given meaning through its correspondence with a world representation. This correspondence can be at multiple levels of granularity or resolutions. In this paper, we introduce an approach to multi-resolution language grounding in the extremely challenging domain of professional soccer commentaries. We define and optimize a factored objective function that allows us to leverage discou...
متن کاملTransmission of Ideology through Translation: A Critical Discourse Analysis of Chomsky’s “Media Control” and its Persian Translations
Among factors that might manipulate translators’ mind while producing a text is the notion of ideology transmission through text or talk. Adopting Critical Discourse Analysis (CDA) with particular emphasis on the framework of Van Dijk (1999), the present investigation is an attempt to shed light on the relationship between language and ideology involved in translation in general, and more speci...
متن کاملApproaching the Symbol Grounding Problem with Probabilistic Graphical Models
In order for robots to engage in dialog with human teammates, they must have the ability to identify correspondences between elements of language and aspects of the external world. A solution to this symbol grounding problem (Harnad, 1990) would enable a robot to interpret commands such as “Drive over to receiving and pick up the tire pallet.” This article describes several of our results that ...
متن کاملBack-flashover Investigation of HV Transmission Lines Using Transient Modeling of the Grounding Systems
The article presents the transients analysis of the substation grounding systems and transmission line tower footing resistances which can affect to the back-flashover (BF) or overvoltage across insulator chain in an HV power systems by using EMTP-RV software. The related transient modeling of the grounding systems is based on a transmission line (TL) model with considering the soil ionization....
متن کاملShared grounding of event descriptions by autonomous robots
The paper describes a system for open-ended communication by autonomous robots about event descriptions anchored in reality through the robot’s sensori-motor apparatus. The events are dynamic and agents must continually track changing situations at multiple levels of detail through their vision system. We are specifically concerned with the question how grounding can become shared through the u...
متن کاملImpact of Ocean-Land Mixed Propagation Path on Equivalent Circuit of Grounding Rods
In this paper, the effect of ocean-land mixed propagation path on the lightning performance of grounding rods is investigated. This effect is focused on two problems. The first is extracting exact equivalent circuit of grounding rods in the presence of oceans. The equivalent circuit can be used in transient analysis of power systems in the neighboring oceans. In the second one, this effect on t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008